(57) Requests for objects are received from one or
more clients in a system comprised of a plurality of 
nodes. One of the requests is sent from one of the cli- 
ents. The request is received from a first node of said 
plurality of nodes by a second node of said plurality of 
nodes. A requested object is returned from the second 
node of the plurality of nodes using one of a plurality of 
protocols. The method may be applied to a scalable and 
highly available cache array. The cache array may enhance
the performance and throughput of Web servers
connected to a network such as the Internet. A network 
dispatcher may send requests to a cache node of a 
cache array. The cache node selected by the network 
dispatcher may either serve the request, hand off the
request to another cache node of a cache array, or
communicate via one of a variety of protocols with another
cache node of the cache array to cooperatively serve 
the request. A network dispatcher, operating in a special 
mode, may also function as a content based router. 
Thus, it is also possible for a network dispatcher to route 
requests using a combination of content-based and 
non-content-based routing in order to further improve 
system performance.
Description

[0001] The present invention relates in general to 
processing of requests for objects in a system comprising
a plurality of nodes. In particular, the present invention
relates to a scalable and highly available cache in
computer networks. Furthermore, the present invention
finds use in an array of caches used in a client/server 
environment such as, in particular, the World Wide Web. 
[0002] Caching is commonly used for improving per- 
formance on computer systems. Once an object is 
stored in a cache, subsequent requests for the cached 
object may be satisfied by the cache. Satisfying re- 
quests for an object from a cache may incur less over- 
head than regenerating or retrieving the object from a 
remote location. Slow performance, coupled with a growing
demand for Web services, may cause Web servers
to become inefficient or unusable. 
[0003] Caching offers a methodology for dealing with
growing demands for greater throughput for Web and 
Proxy servers. Systems of clients and servers on the 
World Wide Web, for example, may use caching to improve
performance. In some instances, Web server applications
may perform slowly and inefficiently without
the benefit of a cache. Without the benefit of a caching 
scheme, Web servers may become a system bottle- 
neck. The underlying operating system running on a 
Web server, for example, may have performance prob- 
lems impeding the throughput of the Web server. One 
technique for improving the performance of Web serv- 
ers is to store frequently requested data (e.g. Web pag- 
es) in a cache. Retrieving data from the cache may re- 
quire less overhead than retrieving the data from the 
Web server. 

[0004] According to a first aspect of the invention 
there is provided a method of processing requests for 
objects from one or more clients in a system including 
a plurality of nodes, the method including steps of: re- 
ceiving a request for an object at a first of said nodes; 
on a determination that the first node is not an owner of 
the requested object; and retrieving the requested ob- 
ject via the second node using one of a plurality of pro- 
tocols. 

[0005] In a preferred system, the nodes each include 
at least one of a plurality of caches. In such a system, 
the step of receiving the request by another one of the 
plurality of nodes is performed either (a) in response to
the one node not having a cached copy of the requested 
object, or (b) in response to the one node not being an 
owner of the requested object. 

[0006] According to yet another aspect of the inven- 
tion there is provided, in a system adapted to receive 
requests for objects from one or more clients, the sys- 
tem comprised of a plurality of nodes, a method for re- 
trieving a requested object of the objects, the method 
comprising the steps of: (a) sending one of the requests 
from one of the clients; (b) receiving the request from 
one node of the plurality of nodes by another node of
the plurality of nodes; and (c) returning the requested
object from the another node of the plurality of nodes 
using one of a plurality of protocols. 
[0007] According to yet another aspect of the invention
there is provided a method of retrieving a requested
object, said method comprising the steps of: (a) trans- 
mitting a request for the requested object to a first cache; 
(b) determining whether the first cache corresponds to 
the requested object; (c) identifying a second cache
corresponding to the requested object; and (d) retrieving
the requested object via the second cache using one of 
a plurality of protocols. 

[0008] According to yet further aspects of the invention,
program storage devices are defined including
instructions executable by a machine for performing the
steps of the various methods defined in the appended 
claims. 

[0009] Embodiments of the invention will now be de- 
scribed, by way of example only, with reference to the 
accompanying drawings in which:

Fig. 1(a) is a block diagram of an exemplary computer
network system in accordance with an
embodiment of the present invention;


Fig. 1(b) is a flowchart diagram of a method for
retrieving a requested object in accordance with an
exemplary embodiment of the present invention; 

Fig. 2 is a block diagram which illustrates a method
for retrieving a requested object in case of a cache
member hit and a cache array hit in accordance with 
an exemplary embodiment of the present invention; 

Fig. 3 is a block diagram which illustrates a method
for retrieving a requested object in case of a cache 
member hit and a cache array miss in accordance 
with an exemplary embodiment of the present in- 
vention; 


Fig. 4 through Fig. 7 are block diagrams which illus- 
trate a method for retrieving a requested object in 
case of a cache member miss in accordance with 
an exemplary embodiment of the present invention; 
and

Fig. 8 is a block diagram which illustrates a method 
for retrieving a requested object in the case of con- 
tent based routing in accordance with an exemplary 
embodiment of the present invention.

[0010] The following terms are used in the description.
While dictionary meanings are also implied by terms
used herein, the following definitions may also be helpful.

Client

[0011] A client is a computer which typically issues
commands and/or requests to one or more servers which
perform the task associated with the command and/or
request.

Server 

[0012] A server is a computer which performs a task
at the command and/or request of one or more client 
computers. 

World Wide Web (Web) 


[0013] An Internet service that links documents by
providing hyperlinks from server to server. Users may
"jump" from document to document by clicking on
highlighted words or phrases of interest (hypertext links),
click on graphics such as applets or image maps, fill in
forms, and enter URLs. Users may "jump" from
document to document no matter where the documents are
stored on the Internet. Internet Web servers support
clients and provide information. Users "browse the Web"
using Web client programs. The Web may be considered
as the Internet with resources addressed by URLs using
HTTP protocols to transfer information between
computers. HTML (among other formats) may be used to
display information corresponding to URLs and provide
a point-and-click interface to other URLs.

Universal Resource Locator (URL) 

[0014] An address for a resource on the Internet. 
URLs are used by Web browsers to locate Internet
resources. A URL specifies the protocol to be used in
accessing the resource (such as http: for a World Wide
Web page or ftp: for an FTP site), the name of the server
on which the resource resides (such as
//www.whitehouse.gov), and, optionally, the path to a
resource (such as an HTML document or a file on that server).

Hyper Text Markup Language (HTML) 

[0015] The markup language used for documents on
the World Wide Web. HTML is an application of SGML
(Standard Generalized Markup Language) that uses
tags to mark elements, such as text and graphics, in a
document to indicate how Web browsers should display
these elements to the user and should respond to user
actions such as activation of a link by means of a key
press or mouse click. HTML is a device-independent
representation of content on Web servers. Web servers
may deliver content (including hyperlinks) to clients in
HTML with confidence that the client will choose an
appropriate presentation.



HyperText Transfer Protocol (HTTP) 

[0016] The client/server protocol used to access
information on the World Wide Web. HTTP is an example
of a stateless protocol. In other words, every request
from a client to a server is treated independently. Clients
send requests to servers and servers respond using this 
protocol. 

User Datagram Protocol (UDP) 

[0017] A connectionless protocol within TCP/IP that 
corresponds to the transport layer in the ISO/OSI model.
UDP converts data messages generated by an applica- 
tion into packets to be sent via IP but may not verify that 
messages have been delivered correctly. Therefore,
UDP may be more efficient than TCP, so it may be used 
for various purposes, including SNMP (Simple Network 
Management Protocol); the reliability may depend on 
the application that generates the message. 

Router 

[0018] An intermediary device on a communications 
network that expedites message delivery. On a single 
network linking many computers through a mesh of pos- 
sible connections, a router receives transmitted mes- 
sages and forwards them to their correct destinations 
over the most efficient available route. On an intercon- 
nected set of local area networks (LANs) using the same 
communications protocols, a router serves the some- 
what different function of acting as a link between LANs, 
enabling messages to be sent from one to another. 

Web Browser 

[0019] A client application that enables a user to view
HTML (or other) documents on the World Wide Web,
another network, or the user's computer; follow the
hyperlinks among them; and transfer files.

Transmission Control Protocol (TCP) 

[0020] The protocol within TCP/IP that governs the 
breakup of data messages into packets to be sent via 
IP, and the reassembly and verification of the complete 
messages from packets received by IP. TCP corresponds
to the transport layer in the ISO/OSI model.

Internet Protocol (IP) 

[0021] The protocol within TCP/IP that governs the 
breakup of data messages into packets, the routing of 
the packets from sender to destination network and sta- 
tion, and the reassembly of the packets into the original 
data messages at the destination. IP corresponds to the 
network layer in the ISO/OSI model.

TCP/IP

[0022] A protocol developed by the Department of De- 
fense for communications between computers. It is built 
into the UNIX system and has become the de facto 
standard for data transmission over networks, including 
the Internet.

Proxy Server 

[0023] A component that manages Internet traffic to 
and from a local area network (LAN) and can provide
other features, such as document caching and access 
control. A proxy server can improve performance by 
supplying frequently requested data, such as a popular 
Web page, and can filter and discard requests that the 
owner does not consider appropriate, such as requests 
for unauthorized access to proprietary files. 

Cache 

[0024] A special memory subsystem in which fre- 
quently used data values are duplicated for quick ac- 
cess. 

Object 

[0025] An object is data that may be stored in a cache, 
server, or client. 

[0026] Fig. 1(a) illustrates an exemplary computer 
network system including: Clients 110, 120, and 130, 
network 150, network dispatcher 160, cache array 170, 
and server cluster 180. Cache array 170 includes cache 
nodes 172, 174, 176, and 178. Server cluster 180 in- 
cludes servers ("back-end" servers) 182, 184, 186, and 
188. Client computers 110, 120, and 130 issue requests
for objects such as, for example, Web pages. 
[0027] Fig. 1(b) is a flowchart diagram of a method for
retrieving a requested object in accordance with an
exemplary embodiment of the present invention. In step
190 a client, for example client 110, requests an object
which may be stored on (or generated or fetched by) a
server, for example server 182. In step 192 the request for
an object is received via a network, for example network
150, by a network dispatcher, for example network
dispatcher 160. In step 194 network dispatcher 160 routes
the request for the object to one cache node, for
example cache node 172. In step 196 a determination is made
whether cache node 172, receiving the request for the
object from network dispatcher 160 in step 194, is a
primary owner of the object. If cache node 172, receiving
the request for the object from network dispatcher 160
in step 194, is the primary owner, then cache node 172
may service the client's request in step 197. Otherwise,
in step 198 cache node 172 and the primary owner may
either function as a proxy (e.g. through an HTTP or a
UDP interface) to service the request and retrieve the
requested object, or cache node 172 may hand off the
request to the primary owner. In steps 197 and 198 the
primary owner retrieves the requested object either from
cache memory or by communicating with server 182.
Note that one skilled in the art will understand that more
than one cache node may be a primary owner of an
object.
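
By way of illustration only, steps 196 through 198 of Fig. 1(b) may be sketched as follows. The sketch models only the proxy path (not the handoff path), and every name in it (CacheNode, fetch_from_server, owner_retrieve, handle_request, and the owners table) is a hypothetical stand-in rather than part of the described embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class CacheNode:
    """Hypothetical model of one cache node of the cache array."""
    name: str
    store: dict = field(default_factory=dict)  # URL -> cached object


def fetch_from_server(url):
    """Stand-in for a back-end server such as server 182."""
    return f"content of {url}"


def owner_retrieve(owner, url):
    """The primary owner serves from cache memory, or from the server on a miss."""
    if url not in owner.store:
        owner.store[url] = fetch_from_server(url)
    return owner.store[url]


def handle_request(first, owners, url):
    """Step 196: test ownership; step 197 or 198: serve the request."""
    owner = owners[url]  # assumed lookup table: URL -> primary owner
    if owner is first:
        return owner_retrieve(first, url), "served by first node"  # step 197
    # Step 198, proxy variant: the first node relays the owner's reply.
    return owner_retrieve(owner, url), f"served via {owner.name}"


node_172, node_176 = CacheNode("172"), CacheNode("176")
owners = {"/a.html": node_172, "/b.html": node_176}
print(handle_request(node_172, owners, "/a.html"))
print(handle_request(node_172, owners, "/b.html"))
```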

[0028] Clients 110, 120, and 130 may be workstations,
personal computers, or other computers connected
to the Internet. For example, a user using a personal
computer at home may request to retrieve and view a
Web page by inputting a corresponding URL using a 
Web browser. The requested Web page, addressed by 
the URL, may belong to a server accessible through 
Web services on the Internet. 

[0029] Cache array 170 may be one or more network
nodes. Each node included in cache array 170 may be
one or more processors. Each processor of each node
of cache array 170 may include a set of one or more
cache members (cache nodes) which may form a single
cache space and a single cache image. In other words,
a client may view a cache array as a single image. For
example, a client may access cache array 170 via an
address associated with network dispatcher 160, but
each node of cache array 170 may have an independent
IP address. Internally a cache array may combine the
resources of individual cache members. For example,
cache space of cache members may be combined to
scale the memory space available for caching. Further,
the individual throughput of cache members may also
be combined to scale the available throughput. Cache
members 172, 174, 176, and 178 may each be addressable,
internal to cache array 170, by a distinct address
(e.g. an IP address). Cache nodes (members) 172, 174,
176, and 178 may be implemented, for example, by a
router. A router such as, for example, an IBM 22XX family
router may be used.

[0030] Network dispatcher 160 may be implemented,
for example, on a TCP router. When network dispatcher
160 receives a request from a client such as, for
example, client 110, network dispatcher 160 may route the
request to a cache node of cache array 170 without
"looking" at the request. Network dispatcher 160 may be, for
example, a service of a router node used to route client
requests for Web pages (or other objects) to cache array
170. Network dispatcher 160 may obtain availability as
well as load information about cache nodes 172, 174,
176, and 178 of cache array 170. Network dispatcher
160 may also route and transmit requests for objects to
a selected cache node, based on the availability of
cache nodes and/or load information (see, for example,
G. Hunt et al., "Network Dispatcher: a connection router
for scalable Internet Services," in Proceedings of the 7th
International World Wide Web Conference, April 1998).
[0031] A correspondence may be formed between
cache nodes and requested objects. A requested object
may correspond to a cache node which is a primary
owner of the requested object as well as to other cache
nodes. For example, Internet addresses expressed by
URLs may be partitioned amongst Web cache nodes.
For each URL, one Web cache node may be assigned
as a primary owner. One or more Web cache nodes in
addition to a primary owner may also correspond to a
single URL. The URLs may be partitioned amongst Web
cache nodes using, for example, a hashing function. In
other words, a hashing function may be used to form a
correspondence between cache nodes and requested
objects.
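
A minimal sketch of such a partitioning is given below, purely as an example. The description does not name a particular hashing function; the use of SHA-256 reduced modulo the number of nodes, and the names CACHE_NODES and primary_owner, are assumptions made for illustration.

```python
import hashlib

# Hypothetical identifiers for the members of cache array 170.
CACHE_NODES = ["172", "174", "176", "178"]


def primary_owner(url, nodes=CACHE_NODES):
    """Map a URL to exactly one cache node (its assumed primary owner).

    SHA-256 is an illustrative choice; any stable hash would serve.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


for url in ("http://example.com/", "http://example.com/news.html"):
    print(url, "->", primary_owner(url))
```

Because the mapping depends only on the URL and the node list, any cache node can compute the primary owner locally without consulting the others. A plain modulo scheme of this kind reassigns most URLs when the number of nodes changes, which is a property of this sketch and not a statement about the described system.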

[0032] When client 110, 120, or 130 issues a request 
to retrieve an object (i.e. a target object), the request is 
routed and transmitted by network dispatcher 160 to a 
first cache of cache array 170. A first cache selected by 
network dispatcher 160 may or may not correspond to 
the requested object. In other words, the requested ob- 
ject may or may not be assigned to the first cache. If the 
first cache selected by network dispatcher 160 corre- 
sponds to the requested object (i.e. it is the primary own- 
er of the requested object) and the requested object is 
stored in the first cache, then the first cache may service 
the request. If the first cache selected by network dis- 
patcher 160 corresponds to the requested object (i.e. it 
is the primary owner of the requested object) and the 
requested object is not stored on the first cache, then 
the first cache may retrieve the requested object from 
an appropriate server 182, 184, 186, or 188 of server 
cluster 180. 

[0033] For example, suppose a user, using a Web 
browser on a client, requests to retrieve and view a Web 
page. The requested Web page may be addressed by 
a URL and stored in a server connected to a network. 
Network dispatcher 160 selects a first cache node of 
cache array 170. Network dispatcher 160 routes the re- 
quest for the Web page to the first cache node. If the 
requested Web page is stored in the first cache node, 
then the first cache node may retrieve the Web page 
from cache memory, and return the Web page to the cli- 
ent. If the first cache node is the primary owner of the 
requested Web page, but the Web page is not stored in 
the first cache node, then the first cache node may ac- 
cess the appropriate server, addressed by the URL, re- 
trieve the Web page from the server, and return the Web 
page to the client. 

[0034] If, however, the first cache selected by network 
dispatcher 160 does not correspond to the requested 
object (i.e. it is not the primary owner of the requested 
object), then the first cache may transmit the request to 
a second cache which does correspond to the request- 
ed object (i.e. the second cache is the primary owner of 
the requested object). The first cache and the second 
cache may communicate to service the request and
retrieve the requested object. Alternatively, the first cache
may hand off the request to the second cache. In the
case of a handoff, the request may be transmitted from
the first cache to the second cache of cache array 170
along with information relating to a TCP connection
(e.g. sequence numbers, IP addresses, and TCP ports).
[0035] A decision on whether the first cache and the
second cache will communicate to service a request, or
whether the first cache will hand off the request to the
second cache, may be made, for example, based on the
size of the requested object.
[0036] For example, the following recipe may be
followed:

1. If the size of the requested object is less than a
threshold size, then the second cache transmits the
requested object to the first cache. The second
cache retrieves the requested object either from
cache memory or by communicating with an
appropriate server. The first cache may then service the
request by transmitting the requested object to a
requesting client.

2. If the size of the requested object is greater than
a threshold size, then the first and second caches
may coordinate to hand off the request:


(i) the TCP connection is handed-off from the 
first cache to the second cache; 

(ii) network dispatcher 160 is informed that the
request will be serviced by the second cache
(note that network dispatcher 160 may also be
informed that further requests for an object are
to be transmitted to the second cache);

(iii) the second cache retrieves the requested
object either from cache memory or by communicating
with an appropriate server, and transmits
the requested object to a requesting client.

Note that the threshold size may be adjusted or varied
dynamically. 
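
The recipe above may be sketched, by way of example only, as a single decision function. The function name choose_interface and the returned labels are hypothetical. The recipe does not state what happens when the object size equals the threshold; the sketch assumes that case is treated as a handoff.

```python
def choose_interface(object_size, threshold):
    """Pick how a non-owner first cache and the owning second cache cooperate.

    Sizes and threshold are in the same (arbitrary) unit, e.g. bytes.
    """
    if object_size < threshold:
        # Case 1: second cache sends the object to the first cache,
        # which then answers the client itself.
        return "proxy"
    # Case 2 (and, by assumption, the equal case): the TCP connection
    # is handed off and the second cache answers the client directly.
    return "handoff"


threshold = 64 * 1024  # illustrative value; the threshold may vary dynamically
for size in (2_000, 64 * 1024, 5_000_000):
    print(size, "->", choose_interface(size, threshold))
```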

[0037] In this type of exemplary coordination the
second cache may have stored information fields useful for
performing a TCP handoff (takeover) and/or an HTTP
handoff (takeover). A first cache, initially receiving a
request from network dispatcher 160, may wait until a
second cache (primary owner of the requested object)
transmits the requested object to the first cache, or
informs the first cache that the second cache will service
the request. Thus, the type of exemplary coordination
described above may be driven by a primary owner of
a requested object.

[0038] A cache member hit occurs when network
dispatcher 160 forwards a connection request to a first
cache which is the primary owner of a requested object.
If, for example, network dispatcher 160 uses a round-robin
arrangement, the probability that a particular cache
member will be selected may be uniformly distributed
(i.e. probability 1/n, in the case of n cache members). In
addition, network dispatcher 160 may access the
addressing information (e.g. URL) associated with the
requested object, but at a substantial overhead cost.
Thus, network dispatcher 160 may access sufficient
information to identify cache array 170 associated with
server cluster 180, and then choose a cache member
randomly. If load and availability information is provided
to network dispatcher 160, the likelihood of selecting a
particular cache may be weighted. In other words, if one
cache is loaded, it may be selected less often, and the
remaining cache members may be selected more often.
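
One possible form of such weighting is sketched below for illustration. The description does not specify a weighting formula; weighting each node by its spare capacity (one minus its load), and the names selection_weights and pick_node, are assumptions. Loads are assumed to lie in the range 0 up to but not including 1, and unavailable nodes are assumed to be omitted from the input.

```python
import random


def selection_weights(loads):
    """Turn per-node load figures into selection probabilities.

    loads: dict mapping node name -> load in [0, 1).
    A more heavily loaded node receives a smaller probability.
    """
    spare = {node: 1.0 - load for node, load in loads.items()}
    total = sum(spare.values())
    return {node: s / total for node, s in spare.items()}


def pick_node(loads, rng=random):
    """Choose one cache node at random according to the weights."""
    weights = selection_weights(loads)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


loads = {"172": 0.9, "174": 0.2, "176": 0.2, "178": 0.2}
print(selection_weights(loads))
print(pick_node(loads))
```

With equal loads the weights reduce to the uniform 1/n case described above.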
[0039] A cache array hit occurs when cache array 170
is able to service a request for an object from the cache
space of one of cache members 172, 174, 176, and 178. It is
thus possible for a cache member hit to occur and cache 
array miss to occur simultaneously. For example, net- 
work dispatcher 160 may select a cache member which 
is the primary owner of the requested object, but the pri- 
mary owner of the requested object does not have the 
requested object in cache memory. Hence, in case of a 
cache member hit and cache array miss, the primary 
owner retrieves the requested object from a server. 
[0040] Further, a cache member miss and cache ar- 
ray hit may occur simultaneously. For example, network 
dispatcher 160 may select a cache member which is not
the primary owner of a requested object, but the primary 
owner of the requested object, a cache member of 
cache array 170, does have the requested object in 
cache memory. 

[0041] Therefore, the following four cases may occur: 

1. Cache member hit, cache array hit 

2. Cache member hit, cache array miss 

3. Cache member miss, cache array hit 

4. Cache member miss, cache array miss 

[0042] In addition, different communication protocols 
may be used to retrieve a requested object. Different 
protocols may be used depending on whether a first 
cache and a second cache communicate to service a 
request, or whether a first cache hands-off a request to 
a second cache. For example, an HTTP interface 
(where a first cache may act as an HTTP proxy), a UDP 
based request, or a handoff may be used. 
[0043] The following eight cases may occur: 

1. Cache member hit, cache array hit. 

2. Cache member hit, cache array miss. 

3. Cache member miss, cache array hit,

3.1. object retrieved using HTTP,

3.2. object retrieved using UDP, or

3.3. object retrieved via a request handoff.

4. Cache member miss, cache array miss,

4.1. object retrieved using HTTP,

4.2. object retrieved using UDP, or 

4.3. object retrieved via a request handoff. 
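
The eight cases listed above may be enumerated, as an illustrative sketch only, from three inputs: whether the first cache is the primary owner, whether the primary owner holds the object in cache memory, and (on a cache member miss) which interface is used. The function name classify and the label format are hypothetical.

```python
INTERFACES = ("HTTP", "UDP", "handoff")


def classify(first_is_owner, owner_has_object, interface=None):
    """Return a label for one of the eight retrieval cases."""
    member = "hit" if first_is_owner else "miss"
    array = "hit" if owner_has_object else "miss"
    label = f"cache member {member}, cache array {array}"
    if first_is_owner:
        # Cases 1 and 2: no inter-cache interface is involved.
        return label
    if interface not in INTERFACES:
        raise ValueError("a cache member miss needs HTTP, UDP, or handoff")
    # Cases 3.1-3.3 and 4.1-4.3.
    return f"{label}, via {interface}"


cases = [classify(True, True), classify(True, False)]
cases += [classify(False, has, i) for has in (True, False) for i in INTERFACES]
for case in cases:
    print(case)
```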

Cache member hit 

[0044] Figs. 2 and 3 are block diagrams which
illustrate a method for retrieving a requested object in case
of a cache member hit in accordance with an exemplary
embodiment of the present invention. Client 110 issues
a request to retrieve an object via network 150. The
requested object is stored on or may be generated or
fetched by one of servers 182 and 184 of server cluster
180. The request issued by client 110 is forwarded via
network 150 to network dispatcher 160. Network
dispatcher 160 then selects a first cache, for example,
cache member 172 of cache array 170. Supposing that
first cache 172 selected by network dispatcher 160 is
the primary owner of the requested object, a cache
member hit occurs.

[0045] Fig. 2 illustrates the case where first cache 172
has the requested object in cache memory (cache array 
hit). In this case, first cache 172 may retrieve the re- 
quested object from cache memory and transmit the re- 
quested object to client 110 via network 150. 
[0046] Fig. 3 illustrates the case where first cache 172
does not have the requested object in cache memory
(cache array miss). In this case, first cache 172 may first
retrieve the requested object from server 182, and then
transmit the requested object to client 110 via network
150. System performance (e.g. overhead and throughput)
for the case of cache member hit and cache array
miss may decrease compared with the case of cache
member hit and cache array hit.

Cache member miss - HTTP interface 

[0047] Figs. 4 and 5 are block diagrams which
illustrate a method for retrieving a requested object in case
of a cache member miss in accordance with an
exemplary embodiment of the present invention. Client 110
issues a request to retrieve an object via network 150.
The requested object is stored on or may be generated
or fetched by one of servers 182 and 184 of server
cluster 180. The request issued by client 110 is forwarded
via network 150 to network dispatcher 160. Network
dispatcher 160 then selects a first cache, for example,
cache member 172 of cache array 170. Supposing that
first cache 172 selected by network dispatcher 160 is
not the primary owner of the requested object, a cache
member miss occurs. Note that in case network
dispatcher 160 randomly selects (with uniform distribution)
a first member of cache array 170 (with n cache
members), the probability of a cache member miss is (n-1)/n.
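
As a worked example of the (n-1)/n figure, the short sketch below evaluates it for the four-member cache array 170. It assumes uniform selection and exactly one primary owner per object; the function name member_miss_probability is hypothetical.

```python
from fractions import Fraction


def member_miss_probability(n):
    """Probability that a uniformly chosen member is not the primary owner."""
    if n < 1:
        raise ValueError("a cache array needs at least one member")
    return Fraction(n - 1, n)


print(member_miss_probability(4))  # prints 3/4
```

Under these assumptions, three of every four requests to a four-member array would first reach a cache that is not the primary owner.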
[0048] Fig. 4 illustrates the case where a second
cache, for example, cache member 176, is the primary
owner of the requested object, and cache member 176
has the requested object in cache memory (cache array
hit). In this case, first cache 172 may accept a
connection with client 110. First cache 172 may then identify
second cache 176 and establish an HTTP interface.
First cache 172 and second cache 176 may
communicate to retrieve the requested object via an HTTP
interface. Second cache 176 may retrieve the requested
object from cache memory and transmit the requested
object to first cache 172. First cache 172 may then transmit
the requested object to client 110 via network 150.
[0049] Fig. 5 illustrates the case where second cache
176 does not have the requested object in cache mem- 
ory (cache array miss). In this case, second cache 176 
may first retrieve the requested object from server 184, 
and then transmit the requested object to first cache 
172. First cache 172 may then transmit the requested 
object to client 110 via network 150. 

Cache member miss - UDP interface 

[0050] Fig. 4 illustrates the case where a second 
cache, for example, cache member 176, is the primary 
owner of the requested object, and cache member 176 
has the requested object in cache memory (cache array 
hit). In this case, first cache 172 may accept a connec- 
tion with client 110. First cache 172 may then identify 
second cache 176 and establish a UDP interface. First
cache 172 and second cache 176 may communicate to
retrieve the requested object via a UDP interface.
Second cache 176 may retrieve the requested object from
cache memory and transmit the requested object to first 
cache 172. First cache 172 may then transmit the re- 
quested object to client 110 via network 150. 
[0051] Fig. 5 illustrates the case where second cache 
176 does not have the requested object in cache memory
(cache array miss). In this case, second cache 176
may first retrieve the requested object from server 184, 
and then transmit the requested object to first cache 
172. First cache 172 may then transmit the requested 
object to client 110 via network 150. 
[0052] A UDP interface may have better performance 
than an HTTP interface because a UDP interface may
avoid a TCP connection having concomitant overhead. 

Cache member miss - handoff interface 

[0053] Figs. 6 and 7 are block diagrams which illus- 
trate a method for retrieving a requested object in case 
of a cache member miss in accordance with an exem- 
plary embodiment of the present invention. Client 110 
issues a request to retrieve an object via network 150. 
The requested object is stored on or may be generated 
or fetched by one of servers 182 and 184 of server clus- 
ter 180. The request issued by client 110 is forwarded 
via network 150 to network dispatcher 160. Network
dispatcher 160 then selects a first cache, for example,
cache member 172 of cache array 170. Supposing that 
first cache 172 selected by network dispatcher 160 is 
not the primary owner of the requested object, a cache 
member miss occurs. 

[0054] Fig. 6 illustrates the case where a second 
cache, for example, cache member 176, is the primary 
owner of the requested object, and cache member 176
has the requested object in cache memory (cache array 
hit). In this case, first cache 172 performs a handoff of 
the request (e.g. along with a TCP connection) to sec- 
ond cache 176. Second cache 176 may then retrieve 
the requested object from cache memory and transmit
the requested object to client 110.
[0055] Fig. 7 illustrates the case where second cache 
176 does not have the requested object in cache memory
(cache array miss). In this case, first cache 172
performs a handoff of the request (e.g. along with a TCP
connection) to second cache 176. Second cache 176 
may first retrieve the requested object from server 184, 
and then transmit the requested object to client 110. 
[0056] Cache array 170 may be equipped with several
features which support a handoff interface. First, all
cache members of cache array 170 may be addressed
using one cluster address (e.g. one IP address). Thus,
first cache 172 and second cache 176 may both accept
requests corresponding to the cluster address, and respond
to client 110. Second, a mechanism such as, for
example, a TCP kernel extension, may be provided to
allow a handoff from first cache 172 (using one TCP
stack) to second cache 176 (using a second TCP stack).
During the handoff, a UDP interface between first cache
172 and second cache 176 may be used to transfer connection
information. The use of a UDP interface may
help to improve performance. Third, along with TCP interface
capabilities, an additional mechanism for handing off
HTTP requests from first cache 172 to second
cache 176 may be implemented. Fourth, network dispatcher
160 may be used to support TCP handoffs.
When a TCP/HTTP connection is handed off from first
cache 172 to second cache 176, it may be desirable to
ensure that the connection flow (from client to server)
passes through second cache 176 (the new owner of
the connection). Either first cache 172 or second cache
176 may direct network dispatcher 160 to update its information
so that subsequent packets are sent to second
cache 176, instead of first cache 172.
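
The dispatcher update described in the fourth feature can be pictured as a change to a per-connection table. The following is a minimal sketch only, not the mechanism of the embodiment: the class name `Dispatcher`, its methods, the node labels, and the use of a (client address, client port) pair as the connection key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    """Hypothetical model of a network dispatcher's connection table."""

    # Maps a connection key (client address, client port) to the cache
    # node that currently owns the connection.
    connections: dict = field(default_factory=dict)

    def assign(self, conn, node):
        """Record the cache node first selected for a new connection."""
        self.connections[conn] = node

    def handoff(self, conn, old_node, new_node):
        """Record that a connection moved from old_node to new_node."""
        if self.connections.get(conn) != old_node:
            raise ValueError("connection is not owned by the node handing it off")
        self.connections[conn] = new_node

    def route(self, conn):
        """Return the node that should receive the next packet of conn."""
        return self.connections[conn]


dispatcher = Dispatcher()
conn = ("10.0.0.5", 40312)          # hypothetical client endpoint
dispatcher.assign(conn, "cache172")
dispatcher.handoff(conn, "cache172", "cache176")
```

After the handoff is recorded, `dispatcher.route(conn)` names the second cache, so later packets of the same connection would follow the new owner.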

[0057] Overhead for a handoff interface may be higher
than in the case of a UDP interface. In the case of a
UDP or HTTP interface, however, performance may
substantially decrease with increasing object size. Unlike
the case of a UDP or HTTP interface, performance
for a handoff interface may be less sensitive to increases
in object size. The sensitivity of performance for a
handoff interface to an increase in object size may be
similar to the sensitivity of performance to an increase
in object size for the case of a cache member hit. Therefore,
performance may be improved by using a handoff
interface for relatively large objects, and a UDP, HTTP,
or other interface for relatively small objects.

Cache member miss - a mixed model 

[0058] The type of interface used (e.g. UDP or hand- 
off) may be chosen based on the size of a requested 
object. A size threshold may be determined such that 
improved performance may result by using one type of 
interface (e.g. UDP) for an object which is smaller than
the size threshold, and another type of interface (e.g. 
handoff) for an object which is larger than the size 
threshold. A size threshold may be determined, for example,
from measuring the throughput of a given computer
network and system.

[0059] For example, a UDP interface may be chosen 
for requested objects which are smaller than 2 Kbytes, 
and a handoff interface may be chosen for requested 
objects which are larger than 2 Kbytes. Referring, for 
example, to Fig. 1(a), suppose client 110 requests an 
object which is stored on or may be generated or fetched 
by server 186. Suppose further that cache member 178
is the primary owner of the requested object. The fol- 
lowing exemplary recipe may be applied: 

1. First cache, say cache member 172, selected by 
network dispatcher 160, may send second cache 
178 a request and, for example, TCP connection 
information. 

2. Second cache 178 may determine the size of the
requested object. The size of the requested object 
may be determined either from cache memory or by 
contacting server 186 in case the object is not in 
cache memory. 

3. If the size of the requested object is less than a
size threshold, second cache 178 may transmit the 
requested object to first cache 172. First cache 172 
may then transmit the requested object to client 110.

4. If the size of the requested object is greater than 
a size threshold, a handoff, for example a TCP/HT- 
TP handoff, between first cache 172 and second 
cache 178 may be performed. Second cache 178 
may then retrieve the requested object either from 
cache memory or from server 186, and transmit the
requested object to client 110. 
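
The recipe above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation of the embodiment: the function names, the node labels, and the treatment of an object whose size equals the threshold (handled here as a large object, a case the text leaves open) are hypothetical. The 2 Kbyte value is the example threshold given in paragraph [0059].

```python
SIZE_THRESHOLD_BYTES = 2 * 1024  # example threshold of 2 Kbytes from the text


def choose_interface(object_size, threshold=SIZE_THRESHOLD_BYTES):
    """Pick the interface used between the first cache and the owner.

    Objects smaller than the threshold use the UDP interface. Larger
    objects use the handoff interface. An object of exactly the
    threshold size is treated as large (an assumption of this sketch).
    """
    return "udp" if object_size < threshold else "handoff"


def response_path(object_size, first="cache172", owner="cache178"):
    """Return the sequence of hops the requested object travels."""
    if choose_interface(object_size) == "udp":
        # Step 3: the owner returns the object to the first cache,
        # which then forwards it to the client.
        return [owner, first, "client"]
    # Step 4: the connection is handed off, so the owner answers
    # the client directly.
    return [owner, "client"]
```

For example, `response_path(512)` passes through the first cache, whereas `response_path(64 * 1024)` goes from the owner straight to the client.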

Content Based Routing 

[0060] In this case, network dispatcher 160 of Fig. 1(a)
may function as content based router 165 of Fig. 8.
In particular, a network dispatcher such as, for example,
network dispatcher 160 of Figs. 2-7 operating in a spe- 
cial mode, may function as content based router 165. 
Addresses for requested objects (e.g. URLs) may be 
partitioned amongst cache nodes 172, 174, 176, and 
178 of cache array 170 using, for example, a hashing 
function. In other words, a hashing function may be used 
to form a correspondence between cache nodes and re- 
quested objects. The address of a requested object may 
be hashed. The output of the hashing function may be 
a cache node associated with one or more servers 
which may generate or fetch a requested object or on 
which a requested object is stored. In other words, the 
output of the hashing function may be a cache member 
that is the primary owner of a requested object. This
does not necessarily imply a special relationship
between a cache node and a server, although such a
relationship may optionally exist.
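
A hashing function of this kind might look like the following sketch. The text does not name a particular hash, so the use of SHA-256 with a modulo reduction, the function name `primary_owner`, and the node labels are assumptions chosen for illustration.

```python
import hashlib

# Hypothetical labels for cache nodes 172, 174, 176, and 178.
CACHE_NODES = ["cache172", "cache174", "cache176", "cache178"]


def primary_owner(url, nodes=CACHE_NODES):
    """Map the address of a requested object to one cache node.

    The same URL always hashes to the same node, so every party that
    applies this function agrees on the primary owner of an object.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

A simple modulo scheme reassigns most objects when the number of nodes changes. Whether that matters depends on how often cache members join or leave the array, which the text does not specify.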



[0061] When a client, such as clients 110, 120, and
130, requests the retrieval of an object, content based
router 165 may perform a handoff of the request along
with, for example, TCP connection information, to a
cache node selected by content based router 165. The
cache node selected by content based router 165 is the
primary owner of the requested object. The selected
cache node may then retrieve the requested object either
from cache memory or from a server with which it
is associated, and transmit the requested object to a
requesting client via network 150.

[0062] As a network dispatcher, operating in a special
mode, may function as a content based router, it is possible
for a network dispatcher to route requests using a
combination of content-based and non-content-based
routing. When content-based routing is used, network
dispatcher 160 operating as content based router 165
may examine a request sent by a client to determine
which cache node is a primary owner of a requested object.
As content based routing avoids cache member
misses, content based routing may reduce processor
cycles spent by the cache array. Content based routing,
however, may increase consumption of processor cycles
by network dispatcher 160 operating as content
based router 165. More processor cycles may be consumed
by network dispatcher 160 operating in content
based routing mode because network dispatcher 160
establishes a connection with a client.
[0063] It may be advantageous to use content based
router 165, as long as it does not become a system bottleneck.
For example, a content based router implemented
by an IBM 2216 router may not become a bottleneck
as long as fewer than 10K requests/sec are
made. If, however, more than 10K requests/sec are
made (statically or dynamically) then it may be advantageous
for an IBM 2216 router to act as a network dispatcher.
Alternatively, a content based router having an
integrated cache may be used. At relatively low and intermediate
system loads, a content based router may
serve requests using an integrated cache. At relatively
high system loads such a router may resume the functionality
of a network dispatcher.
[0064] For example, an IBM 2216 router is capable of
routing approximately 15,000 requests per second as a
network dispatcher. An IBM 2216 router acting as a
cache node is capable of serving approximately 5,000
objects per second at nearly 100% cache rate. An IBM
2216 router acting as a content based router with handoff
is able to serve approximately 5,000/0.5 = 10,000
requests a second.

[0065] In a variation of the above exemplary embodiments
of the present invention illustrated, for example,
in Figs. 2-7, a certain percentage of requests routed by
network dispatcher 160 to cache array 170 may be performed
in content-based routing mode. Content-based
routing may incur less overhead on cache array 170
but more overhead at network dispatcher 160. The percentage
of requests for which content-based routing is
used may be selected to balance utilization of network
dispatcher and cache array resources. A further variation
may be to increase the percentage of requests which
are routed using content-based routing mode in response
to cache array 170 becoming a bottleneck. In
response to network dispatcher 160 becoming a bottleneck,
however, the percentage of requests which are
routed using content-based routing mode may be decreased.
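
One way to picture this adjustment is a small feedback rule. The sketch below is an assumption throughout: the text does not say how a bottleneck is detected or by how much the percentage changes, so the utilization inputs, the 0.9 bottleneck level, the 0.05 step, and the function name are hypothetical.

```python
def adjust_fraction(fraction, dispatcher_util, array_util,
                    step=0.05, high=0.9):
    """Return an updated fraction of requests routed by content.

    fraction        -- current share of requests (0.0 to 1.0) routed
                       in content-based routing mode
    dispatcher_util -- measured utilization of the network dispatcher
    array_util      -- measured utilization of the cache array
    """
    if array_util >= high and dispatcher_util < high:
        # Cache array is the bottleneck: route more requests by
        # content to spare it cache member misses.
        fraction += step
    elif dispatcher_util >= high and array_util < high:
        # Dispatcher is the bottleneck: route fewer requests by
        # content to reduce the work done per request.
        fraction -= step
    # Keep the result a valid share of requests.
    return min(1.0, max(0.0, fraction))
```

When neither component, or both components, are at the bottleneck level, the sketch leaves the fraction unchanged.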

[0066] Note that, optionally, the functionality of a network
dispatcher, such as, for example, network dispatcher
160 of Fig. 1(a), may be integrated into a cache node.
[0067] One skilled in the art may conceive of adapta- 
tions of the above exemplary embodiments. For exam- 
ple, a client request need not be directed by a first node 
to a primary owner of a partition to which a requested 
object belongs. If a primary owner is overloaded, a re- 
quest may be sent to a second node. A second node 
may return a requested object to a first node, or a sec- 
ond node may return a requested object directly to a cli- 
ent using handoff. Optionally, a request may be handed- 
off to a server node. 

[0068] Although illustrated and described herein with 
reference to certain exemplary embodiments, the 
present invention is nevertheless not intended to be lim- 
ited to the details shown. Rather, various modifications 
may be made without departing from the scope of the 
invention as defined by the appended claims. 

Claims 

1. A method for processing requests for objects from 
one or more clients in a system including a plurality 
of nodes, the method including the steps of: 

receiving a request for an object at a first of the 
nodes; 

on a determination that the first node is not an 
owner of the requested object, identifying a 
second node that is an owner of the requested 
object; and 

retrieving the requested object via the second 
node using one of a plurality of protocols. 

2. The method of claim 1, wherein the one of the plu- 
rality of protocols is selected based on the size of 
the requested object. 

3. The method of claim 1 or claim 2, wherein in re- 
sponse to the size of the requested object being 
less than a predetermined size the requested object 
is returned from the second node to the first node 
and subsequently from the first node to the request- 
ing client. 



4. The method of claim 1 or claim 2, wherein in re- 
sponse to the size of the requested object being 
greater than a predetermined size the requested 
object is returned from the second node to the requesting
client without first passing through the first
node.

5. The method of any preceding claim, wherein the 
first and second nodes each include at least one of
a plurality of caches.

6. The method of any preceding claim, wherein one of 
the plurality of nodes functions as a back-end serv- 
er. 

7. The method of claim 6, further comprising the step 
of sending the request to the back-end server in re- 
sponse to the second node not having a cached 
copy of the requested object. 

8. The method of any preceding claim, wherein the 
one of the plurality of protocols is one of HTTP (Hy- 
perText Transfer Protocol), UDP (User Datagram 
Protocol), and Handoff interface. 

9. A method of retrieving a target object of a plurality 
of objects, the target object requested by a client, 
said method comprising the steps of: 

assigning each of the plurality of objects to at
least one of a plurality of nodes;

assigning at least one of a plurality of caches
to each of the plurality of nodes;

transmitting a request for the target object to a
first node of the plurality of nodes;

determining if the target object is assigned to
the first node; and

if the target object is not assigned to the first
node transmitting the request to a second node
of the plurality of nodes, the target object being
assigned to the second node; and

retrieving the target object from a cache of the
plurality of caches assigned to the second node
if the target object is stored in the cache assigned
to the second node using one of a plurality
of protocols.

10. The method according to claim 9, wherein the step 
of retrieving the target object is performed by communicating
via at least one of the plurality of protocols.

11. The method according to claim 10, wherein a first
protocol of the plurality of protocols is executed between
the first node and the second node, and a
second protocol of the plurality of protocols is executed
between the first node and the client.

12. The method according to claim 10, wherein a first 
protocol of the plurality of protocols is executed be- 
tween the second node and the client. 

13. A program storage device readable by machine,
tangibly embodying a program of instructions executable
by the machine to perform method steps for
processing requests for objects from one or more
clients in a system including a plurality of nodes, the
method steps comprising:

receiving a request for an object at a first of said
nodes;

on a determination that the first node is not an
owner of the requested object, identifying a
second node that is an owner of the requested
object; and

retrieving the requested object via the second
node using one of a plurality of protocols.

14. A method for routing requests between nodes in a
system comprised of a plurality of nodes, the method
comprising the steps of:

(a) selectively sending one of the requests to
at least one of the nodes based on the content
of the request;

(b) selectively sending a request of the requests
to any one of the nodes independent of
the content of the request; and

(c) varying a frequency with which steps (a) and
(b) are performed in order to improve performance
of the system.

15. The method of claim 14, wherein the frequency with
which step (a) is performed is increased in order to
reduce use of system resources and improve system
performance.

16. The method of claim 14, wherein the frequency with
which step (b) is performed is increased in order to
reduce use of system resources and improve system
performance.

17. The method of claim 14, wherein step (a) further includes
handing off a connection from one of the
nodes to another of the nodes.

18. A system for processing requests for objects from
one or more clients, the system including:

a plurality of nodes;

means for receiving a request for an object at
a first of said nodes;

means for determining whether the first node is
an owner of the requested object;

means responsive to a negative determination
for identifying a second node within the system
that is an owner of the requested object; and

means for retrieving the requested object via
the second node using one of a plurality of protocols.